Using Generalized Learning Automata for State Space Aggregation in MAS

Authors

  • Yann-Michaël De Hauwere
  • Peter Vrancx
  • Ann Nowé

Abstract

A key problem in multi-agent reinforcement learning remains dealing with the large state spaces typically associated with realistic distributed agent systems. As the state space grows, agent policies become more complex and learning slows. One way for an agent to keep learning in these large-scale systems is to learn a policy that generalizes over states, rather than trying to map each individual state to an action. In this paper we present a multi-agent learning approach capable of aggregating states, using simple reinforcement learners called learning automata (LA). Independent learning automata have already been shown to perform well in multi-agent environments. Previously we proposed LA-based multi-agent algorithms capable of finding a Nash equilibrium between agent policies. In these algorithms, however, one LA per agent is associated with each system state; as such, the approach is limited to discrete state spaces. Furthermore, as the number of states increases, the number of automata also increases and the learning speed of the system slows down. To deal with this problem, we propose to use Generalized Learning Automata (GLA), which are capable of identifying regions within the state space that share the same optimal action, and as such of aggregating states. We analyze the behaviour of GLA in a multi-agent setting and demonstrate results on a set of sample problems.
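
To make the idea of state aggregation concrete, here is a minimal single-agent sketch of a generalized learning automaton. The abstract does not give the update rule, so everything below is an assumption made for illustration: the softmax action probabilities over a linear function of the state, the REINFORCE-style gradient update, the class name `GLA`, the helper `train_demo`, and the toy task (a one-dimensional state in [-1, 1] where action 0 is optimal for negative states and action 1 otherwise). It is not the authors' algorithm or their experimental setup.

```python
import math
import random


class GLA:
    """Hypothetical generalized learning automaton: one weight vector per action."""

    def __init__(self, n_actions, n_features, lr=0.1, rng=None):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr
        self.rng = rng or random.Random(0)

    def probs(self, x):
        # Softmax over linear scores (an assumed parameterization).
        scores = [sum(wi * xi for wi, xi in zip(w, x)) for w in self.w]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        return [e / z for e in exps]

    def act(self, x):
        p = self.probs(x)
        return self.rng.choices(range(len(p)), weights=p)[0]

    def update(self, x, action, reward):
        # REINFORCE-style step: reward times the gradient of log pi(action | x).
        p = self.probs(x)
        for a, w in enumerate(self.w):
            grad = (1.0 if a == action else 0.0) - p[a]
            for i, xi in enumerate(x):
                w[i] += self.lr * reward * grad * xi


def train_demo(steps=5000, seed=0):
    """Train one automaton on the assumed two-region toy task."""
    rng = random.Random(seed)
    gla = GLA(n_actions=2, n_features=2, rng=rng)
    for _ in range(steps):
        s = rng.uniform(-1.0, 1.0)
        x = [s, 1.0]  # state plus a constant bias feature
        a = gla.act(x)
        best = 0 if s < 0 else 1
        gla.update(x, a, 1.0 if a == best else 0.0)
    return gla


if __name__ == "__main__":
    gla = train_demo()
    for s in (-0.8, -0.2, 0.2, 0.8):
        print(s, [round(p, 3) for p in gla.probs([s, 1.0])])
```

A single automaton with two weight vectors covers the whole continuous state interval here, whereas the one-LA-per-state scheme described above would need a separate automaton for every discretized state.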

Similar resources

Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...

Improved Frog Leaping Algorithm Using Cellular Learning Automata

In this paper, a new algorithm that combines cellular learning automata with the shuffled frog leaping algorithm (SFLA) is proposed for optimization in continuous, static environments. In the proposed algorithm, each memeplex of frogs is placed in a cell of the cellular learning automata. The learning automaton in each cell acts as the brain of the memeplex, and will determine the strategy of moti...

Reduction of Computational Complexity in Finite State Automata Explosion of Networked System Diagnosis (RESEARCH NOTE)

This research puts forward rough finite state automata represented by two variants of BDD, called ROBDD and ZBDD. The proposed structures have been used in networked system diagnosis and can overcome combinatorial explosion. The implementation uses the CUDD (Colorado University Decision Diagrams) package. A mathematical proof of the claimed complexity is provided, which shows ZBDD ...

Optimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining

Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page, one type of web data, is modeled as a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...

Generalized learning automata for multi-agent reinforcement learning

A major challenge in multi-agent reinforcement learning remains dealing with the large state spaces typically associated with realistic multi-agent systems. As the state space grows, agent policies become increasingly complex and learning slows down. Currently, advanced single-agent techniques are already very capable of learning optimal policies in large unknown environments. When multiple age...

Publication date: 2008